Dependency-based Sentence Alignment for Multiple Document Summarization
Abstract
In this paper, we describe a method of automatic sentence alignment for building extracts from abstracts in automatic summarization research. Our method is based on two steps. First, we introduce the “dependency tree path” (DTP). Next, we calculate the similarity between DTPs based on the ESK (Extended String Subsequence Kernel), which considers sequential patterns. By using these procedures, we can derive one-to-many or many-to-one correspondences among sentences. Experiments using different similarity measures show that DTP consistently improves the alignment accuracy and that ESK gives the best performance.
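The two-step idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not define the DTP, so the code assumes a DTP is a leaf-to-root path in a dependency tree. It also swaps in a normalized longest-common-subsequence score where the paper uses ESK, which is a different and richer measure. All function names, the child-to-head tree encoding, and the threshold are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Tree = Dict[str, Optional[str]]  # hypothetical encoding: word -> head word (None for the root)


def tree_paths(parent: Tree) -> List[List[str]]:
    """Return every leaf-to-root path (assumed here to play the role of a DTP).

    Assumption: each word occurs once per sentence, so words can act as node ids.
    """
    heads = {h for h in parent.values() if h is not None}
    paths = []
    for node in parent:
        if node in heads:
            continue  # a node that heads another word is not a leaf
        path: List[str] = []
        current: Optional[str] = node
        while current is not None:
            path.append(current)
            current = parent[current]
        paths.append(path)
    return paths


def lcs_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two word sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def path_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Normalized LCS score in [0, 1]; a stand-in for ESK, not ESK itself."""
    if not a or not b:
        return 0.0
    return 2 * lcs_len(a, b) / (len(a) + len(b))


def sentence_similarity(tree_a: Tree, tree_b: Tree) -> float:
    """Average, over the paths of tree_a, of the best-matching path in tree_b."""
    paths_a, paths_b = tree_paths(tree_a), tree_paths(tree_b)
    if not paths_a or not paths_b:
        return 0.0
    return sum(max(path_similarity(p, q) for q in paths_b) for p in paths_a) / len(paths_a)


def align(abstract: List[Tree], source: List[Tree], threshold: float = 0.5) -> Dict[int, List[int]]:
    """Map each abstract sentence to all source sentences scoring at least threshold."""
    return {
        i: [j for j, s in enumerate(source) if sentence_similarity(a, s) >= threshold]
        for i, a in enumerate(abstract)
    }
```

Because `align` keeps every source sentence above the threshold rather than only the best one, a single abstract sentence can map to several source sentences, which is one simple way to obtain the one-to-many correspondences the abstract mentions.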
Similar resources
Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization
We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel depe...
Annotating a parallel monolingual treebank with semantic similarity relations
We describe an ongoing effort to build a large-scale parallel and comparable monolingual treebank for Dutch of 1 million words, where nodes of dependency trees are aligned and labeled according to a limited set of semantic similarity relations. We address alignment of sentences and dependency trees, both manual and automatic. We introduce new annotation tools, present results from pilot experim...
Annotating a parallel monolingual treebank with semantic similarity relations
We describe an ongoing effort to build a large-scale parallel/comparable monolingual treebank for Dutch of 1 million words, where nodes of dependency trees are aligned and labeled according to a limited set of semantic similarity relations. We address alignment of sentences and dependency trees, both manual and automatic. We introduce new annotation tools, present results from pilot experiments...
Single Document Summarization based on Nested Tree Structure
Many methods of text summarization combining sentence selection and sentence compression have recently been proposed. Although the dependency between words has been used in most of these methods, the dependency between sentences, i.e., rhetorical structures, has not been exploited in such joint methods. We used both dependency between words and dependency between sentences by constructing a nes...
Key Sentence Extraction from Single Document based on Triangle Analysis in Dependency Graph
Document summarization is a technique aimed to automatically extract main ideas from electronic documents. In this paper, we propose a novel algorithm, called TriangleSum for key sentence extraction from single document based on graph theory. The algorithm builds a dependency graph for the underlying document based on co-occurrence relation as well as syntactic dependency relations. The nodes r...
Publication date: 2004